Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apple silicon has 128 byte alignment so fix our defines to match #52996

Merged
merged 1 commit into from
Jan 24, 2024

Conversation

gbaraldi
Copy link
Member

julia/base/lock.jl

Lines 33 to 50 in 8a69745

mutable struct ReentrantLock <: AbstractLock
# offset = 16
@atomic locked_by::Union{Task, Nothing}
# offset32 = 20, offset64 = 24
reentrancy_cnt::UInt32
# offset32 = 24, offset64 = 28
@atomic havelock::UInt8 # 0x0 = none, 0x1 = lock, 0x2 = conflict
# offset32 = 28, offset64 = 32
cond_wait::ThreadSynchronizer # 2 words
# offset32 = 36, offset64 = 48
# sizeof32 = 20, sizeof64 = 32
# now add padding to make this a full cache line to minimize false sharing between objects
_::NTuple{Int === Int32 ? 2 : 3, Int}
# offset32 = 44, offset64 = 72 == sizeof+offset
# sizeof32 = 28, sizeof64 = 56
ReentrantLock() = new(nothing, 0x0000_0000, 0x00, ThreadSynchronizer())
end
this probably also needs a fix, and maybe other places as well

Copy link
Sponsor Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also review the uses of malloc_cache_align? This particular field seems to be intended to be the largest sized value that can be loaded in a single instruction, or at least whatever alignment is most optimal for that (in bytes)?

@gbaraldi
Copy link
Member Author

I guess we should review these things overall. I'm not sure if it has anything to do with specific instructions because both 64 byte alignment and loads are way to large to matter to instructions and seem to do with caches.

@gbaraldi
Copy link
Member Author

I think the goal with malloc_cache_align is to make sure that different mallocs don't end up in the same cache line, so the large alignment asked here makes some sense.

Copy link
Member

@d-netto d-netto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will probably be useful for #52994.

Can we prioritize merging it and backporting it to 1.10?

Thanks.

@d-netto d-netto added the backport 1.10 Change should be backported to the 1.10 release label Jan 22, 2024
@d-netto
Copy link
Member

d-netto commented Jan 24, 2024

Some versions of PPC apparently also have a cache line size of 128 bytes: https://reviews.llvm.org/D33656.

I don't know what's the status of supporting PPC, but it might be a good idea to adjust it as well.

@KristofferC KristofferC mentioned this pull request Jan 24, 2024
33 tasks
@KristofferC KristofferC merged commit 91ec2bb into master Jan 24, 2024
8 checks passed
@KristofferC KristofferC deleted the gb/m1-alignment branch January 24, 2024 13:20
@vchuravy
Copy link
Member

We only support PPC8+ so yeah wes should adjust for that as well l.

KristofferC pushed a commit that referenced this pull request Jan 24, 2024
KristofferC added a commit that referenced this pull request Feb 6, 2024
Backported PRs:
- [x] #51095 <!-- Fix edge cases where inexact conversions to UInt don't
throw -->
- [x] #52583 <!-- Don't access parent of triangular matrix in powm -->
- [x] #52645 <!-- update --gcthreads section in command line options -->
- [x] #52423 <!-- update nthreads info in versioninfo -->
- [x] #52721 <!-- inference: Guard TypeVar special case against vararg
-->
- [x] #52637 <!-- fix finding bundled stdlibs even if they are e.g.
devved in an environment higher in the load path -->
- [x] #52752 <!-- staticdata: handle cycles in datatypes -->
- [x] #52758 <!-- use a Dict instead of an IdDict for caching of the
`cwstring` for Windows env variables -->
- [x] #51375 <!-- Insert hardcoded backlinks to stdlib doc pages -->
- [x] #52994 <!-- place work-stealing queue indices on different cache
lines to avoid false-sharing -->
- [x] #53015 <!-- Add type assertion in iterate for logicalindex -->
- [x] #53032 <!-- Fix a list in GC devdocs -->
- [x] #52748 
- [x] #52856 
- [x] #52878
- [x] #52754 
- [x] #52228
- [x] #52924
- [x] #52569 <!-- Fix GC rooting during rehashing of iddict -->
- [x] #52605 <!-- Default uplo in symmetric/hermitian -->
- [x] #52618 <!-- heap snapshot: add gc roots and gc finalist roots to
fix unrooted nodes -->
- [x] #52781 <!-- fix type-stability bugs in Ryu code -->
- [x] #53055 <!-- Profile: use full terminal cols to show function name
-->
- [x] #53096 
- [x] #53076 
- [x] #52841 <!-- Extensions: make loading of extensions independent of
what packages are in the sysimage -->
- [x] #52078 <!-- Replace `&hArr;` by `&harr;` in documentation -->
- [x] #53035 <!-- use proper cache-line size variable in work-stealing
queue -->
- [x] #53066 <!-- doc: replace harr HTML entity by unicode -->
- [x] #52996 <!-- Apple silicon has 128 byte alignment so fix our
defines to match -->
- [x] #53121 

Non-merged PRs with backport label:
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC removed the backport 1.10 Change should be backported to the 1.10 release label Feb 6, 2024
Drvi pushed a commit to RelationalAI/julia that referenced this pull request Jun 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants